" Input:  {
            dict                          - format prepared for predictive analytics
                                            {
                                              ("dict")                - add to meta of the entry (useful for subview_uoa, for example)
                                              ("meta")                - coarse-grain meta information to distinguish entries (species)
                                              ("tags")                - tags (separated by comma)
                                              ("subtags")             - subtags to write to a point
                                              ("dependencies")        - (resolved) dependencies
                                              ("choices")             - choices (for example, optimizations)
                                              ("features")            - species features in points inside entries (mostly unchanged)
                                                                        (may contain state, such as frequency or cache/bus contention, etc.)
                                              "characteristics"       - (dict) species characteristics in points inside entries (measured)
                                                  or
                                              "characteristics_list"  - (list) adding multiple experiments at the same time

                                              Note: at the end, we only keep characteristics_list and append characteristics to this list ...

                                              Note: if a string starts with @@, it should be of format "@@float_value1,float_value2,..."
                                                    and will be converted into a list of values which will be statistically processed
                                                    as one dimension in time (needed to deal properly with benchmarks like slambench,
                                                    which report kernel times for all frames)

                                              (pipeline_state)        - final state of the pipeline
                                                                        {
                                                                          'repetitions':
                                                                          'fail_reason'
                                                                          'fail'
                                                                          'fail_bool'
                                                                        }

                                              (choices_desc)          - choices description
                                              (features_desc)         - features description
                                              (characteristics_desc)  - characteristics description

                                              (pipeline)              - (dict) if experiment comes from a pipeline, record it
                                                                        to be able to reproduce/replay it
                                              (pipeline_uoa)          - if experiment comes from a CK pipeline (from some repo), record UOA
                                              (pipeline_uid)          - if experiment comes from a CK pipeline (from some repo), record UID
                                                                        (to be able to reproduce experiments, test other choices
                                                                         and improve the pipeline by the community/workgroups)

                                              (dict_to_compare)       - flat dict to calculate improvements
                                            }

            (experiment_repo_uoa)         - if defined, use it instead of repo_uoa (useful for remote repositories)
            (remote_repo_uoa)             - if remote access, use this as a remote repo UOA

            (experiment_uoa)              - if entry with aggregated experiments is already known
            (experiment_uid)              - if entry with aggregated experiments is already known

            (force_new_entry)             - if 'yes', do not search for an existing entry, but add a new one!

            (search_point_by_features)    - if 'yes', find subpoint by features
            (features_keys_to_process)    - list of keys for features (and choices) to process/search (can be wildcards);
                                            by default ['##features#*', '##choices#*', '##choices_order#*']

            (ignore_update)               - if 'yes', do not record update control info (date, user, etc.)

            (sort_keys)                   - if 'yes', sort keys in output json

            (skip_flatten)                - if 'yes', skip flattening and analyzing data (including stat analysis) ...
            (skip_stat_analysis)          - if 'yes', just flatten array and add #min

            (process_multi_keys)          - list of key prefixes to perform stat analysis on the flat array;
                                            by default ['##characteristics#*', '##features#*', '##choices#*'];
                                            if empty, no stat analysis

            (record_all_subpoints)        - if 'yes', record all subpoints (i.e. do not search and reuse existing points by features)

            (max_range_percent_threshold) - (float) if set, record all subpoints where max_range_percent exceeds this threshold;
                                            useful to avoid recording too many similar points and keep only *unusual* ones ...

            (record_desc_at_each_point)   - if 'yes', record descriptions for each point and not just for the entry;
                                            useful if descriptions change at each point (say, when checking all compilers
                                            for 1 benchmark in one entry - then compiler flags will be changing)

            (record_deps_at_each_point)   - if 'yes', record dependencies for each point and not just for the entry;
                                            useful if dependencies change at each point (say, different programs
                                            may require different libs)

            (record_permanent)            - if 'yes', mark as permanent (to avoid being deleted by the Pareto filter)

            (skip_record_pipeline)        - if 'yes', do not record pipeline (to avoid saving too much stuff during crowd-tuning)
            (skip_record_desc)            - if 'yes', do not record desc (to avoid saving too much stuff during crowd-tuning)
          }

  Output: {
            return         - return code = 0, if successful
                                         > 0, if error
            (error)        - error text if return > 0

            update_dict    - dict after updating entry
            dict_flat      - flat dict with stat analysis (if performed)
            stat_analysis  - whole output of stat analysis (with warnings)

            flat_features  - flat dict of real features of the recorded point
                             (can later be used to search for the same points)

            recorded_uid   - UID of the recorded experiment
            point          - recorded point
            sub_point      - recorded subpoint

            elapsed_time   - elapsed time (useful for debugging - to speed up processing of "big data" ;) )
          } "
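The "@@float_value1,float_value2,..." convention described in the Input section can be illustrated with a small, self-contained sketch. The helper name `convert_iterative_value` is hypothetical and not part of the CK API; it only demonstrates the conversion of such strings into lists of floats for per-dimension statistical processing:

```python
def convert_iterative_value(value):
    """If a string starts with '@@', interpret the rest as a
    comma-separated list of floats and return that list;
    otherwise return the value unchanged.

    Hypothetical helper illustrating the '@@' convention only,
    not the actual CK implementation.
    """
    if isinstance(value, str) and value.startswith('@@'):
        return [float(x) for x in value[2:].split(',') if x != '']
    return value

# Example: per-frame kernel times, as reported by benchmarks like slambench
times = convert_iterative_value('@@0.031,0.029,0.030')
```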
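The flat keys used by features_keys_to_process and process_multi_keys (e.g. '##features#*') refer to a nested dict flattened into '##key#subkey' paths. The following is a simplified sketch of that idea using wildcard matching, not CK's internal flattening routine:

```python
import fnmatch

def flatten(d, prefix='#'):
    """Flatten a nested dict into flat keys such as
    '##features#cpu_freq' (simplified sketch, not CK's real routine)."""
    flat = {}
    for k, v in d.items():
        key = prefix + '#' + str(k)
        if isinstance(v, dict):
            flat.update(flatten(v, key))
        else:
            flat[key] = v
    return flat

def select_keys(flat, patterns):
    """Keep only flat keys matching any wildcard pattern, e.g. the
    default ['##features#*', '##choices#*', '##choices_order#*']."""
    return {k: v for k, v in flat.items()
            if any(fnmatch.fnmatch(k, p) for p in patterns)}

point = {'features': {'cpu_freq': 2000},
         'choices': {'opt': '-O3'},
         'characteristics': {'time': 1.2}}
flat = flatten(point)
search_keys = select_keys(flat, ['##features#*', '##choices#*'])
```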
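The stat analysis over repetitions (and the max_range_percent_threshold check) can be sketched roughly as below. The exact formula CK uses for max_range_percent is not shown in this doc; the range-relative-to-mean definition here is an assumption for illustration:

```python
def summarize(values, threshold=None):
    """Compute simple per-dimension stats for a list of repetitions
    and flag whether the spread exceeds a threshold (in percent).
    The max_range_percent formula is an assumption, not CK's exact one."""
    mn, mx = min(values), max(values)
    mean = sum(values) / len(values)
    max_range_percent = 0.0 if mean == 0 else (mx - mn) / mean * 100.0
    stats = {'#min': mn, '#max': mx, '#mean': mean,
             '#max_range_percent': max_range_percent}
    # An 'unusual' point would be recorded even without record_all_subpoints
    stats['#unusual'] = (threshold is not None
                         and max_range_percent > threshold)
    return stats
```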
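The Output section follows CK's usual return convention: an integer 'return' key (0 on success) plus an 'error' string when it is non-zero. A caller-side sketch, where `add_experiment` is a hypothetical stand-in for invoking this action (e.g. via ck.access with 'action':'add' and 'module_uoa':'experiment'):

```python
def add_experiment(i):
    """Hypothetical stand-in for this module's 'add' action; returns a
    CK-style dict with 'return' and, on error, 'error'. The validation
    and output values here are illustrative only."""
    if 'dict' not in i:
        return {'return': 1, 'error': 'dict is not specified'}
    return {'return': 0, 'recorded_uid': 'some-uid', 'elapsed_time': 0.0}

r = add_experiment({'dict': {'characteristics': {'time': 1.2}}})
if r['return'] > 0:
    raise RuntimeError(r['error'])
```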